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The  coverage  probability  is  defined  as  the  probability  of  a 
sequence  of  random  variables  falling  into  a  given  region 
simultaneously.    Let  {X^.  t=l,2,  ...  ,k}  be  a  sequence  of  random 
variables  of  size  k  and  Bt={(a^.bt),  t=1.2,  ...  .k}  be  a  sequence  of 
intervals  in  the  real  line.    The  exact  value  of  the  coverage 
probability.  Pr{X^eQ^,  t=l,2.  .  .  .  ,k}.  is  usually  difficult  to 
compute  when  the  X^'s  are  not  independent  of  each  other.    In  the  first 
part  of  this  dissertation,  a  computational  method  to  evaluate  the 
coverage  probability  at  any  required  accuracy  when  {x^}  is  a  first- 
order  autoregressive  normal  process  or  when  {X^}  is  the  prediction 
error  sequence  of  that  is  derived.    A  Monte  Carlo  study  to  estimate 
Pr{X^eB^,  t=l,2,  .  .  .  ,k}  was  conducted  to  confirm  the  computational 
technique.    The  results  are  satisfactorily  close  to  each  other. 
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CHAPTER  I 
INTRODUCTION 


1.1    Statement  of  the  Problems 

In  this  dissertation  two  slightly  related  topics  in  time  series 
are  studied.    They  are  related  because  both  of  them  deal  with  time 
series  model  building.    However,  they  are  only  slightly  related 
because  one  deals  with  model  validation  in  the  time  domain  and  the 
other  deals  with  model  building  in  the  frequency  domain. 

The  first  problem  occurs  in  time  series  when  one  wants  to  vali- 
date the  model  after  he  has  already  picked  a  time  series  model  and 
considers  it  a  good  approximation  to  the  true  time  series.    A  reason- 
able way  to  do  this  is  to  compute  a  sequence  of  prediction  values  and 
their  corresponding  100  (l-a)%  confidence  intervals  and  to  see  whether 
the  proportion  of  future  observations  to  fall  into  these  intervals 
agrees  with  what  the  confidence  intervals  predict.    Since  the 
confidence  intervals  are  computed  separately,  the  probability  (l-o) 
just  applies  to  each  individual  forecast  but  not  jointly  to  the  fore- 
casts at  the  different  lead  times.    It  is  natural  to  ask  what  is  the 
probability  that  the  future  time  series  stays  within  all  the  confi- 
dence intervals  simultaneously.    We  will  call  this  probability  the 
coverage  probability.    The  coverage  probability  can  also  be  used  to 
find  the  first  passage  time  distribution  for  a  stationary  process.  A 
computational  method  to  compute  the  coverage  probability  for  some 
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special  cases  of  stationary  time  series  is  the  main  theme  of  the  next 
two  chapters. 

The  second  problem  occurs  when  one  is  dealing  with  random 
fluctuations  in  time  series.    In  analyzing  time  series  data,  one 
probably  would  like  to  ask  himself  the  following  questions  before  he 
starts  to  fit  a  model  to  the  data  in  order  to  get  a  good  result: 

1.  Does  this  set  of  data  show  some  periodical  trend? 

2.  What  kind  of  trend  is  it? 

3.  How  shall  I  deal  with  the  trend  if  it  exists? 

The  answer  to  question  1  is  yes  quite  often  because  time  series 
data  more  or  less  show  regular  fluctuations.    These  up-and-down 
movements  (or  trends)  may  be  strictly  periodic  or  nearly  so.  They 
might  fluctuate  irregularly  but  always  keep  the  same  period  in  each 
cycle.    Another  possible  type  of  irregular  fluctuation  is  that  its 
period  varies  from  cycle  to  cycle.    Hours  of  sunshine  per  day  is  an 
example  of  the  first  type.    Examples  of  the  second  type  are  economic 
time  series  such  as  weekly  total  sales  in  which  the  fluctuations 
reflect  or  constitute  the  business  cycle  or  physical  time  series  such 
as  weekly  temperatures  in  which  the  fluctuations  come  from  the  change 
of  the  season. 

In  analyzing  data  with  trend  of  the  first  or  second  type,  one 
always  de trends  the  data  by  taking  the  differences  between  the  data  or 
decomposing  the  series  into  a  sum  of  a  deterministic  periodical  func- 
tion such  as  some  trigonometric  functions  and  some  random  component. 

Some  examples  of  time  series  with  the  third  type  of  trend  are 
sunspot  data  and  the  speech  wave-form.    This  kind  of  data  does  possess 
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some  cyclical  trend  but  its  frequency  varies.    Thus,  we  call  it 
jittery  cosine  wave.    The  term  jittery  comes  from  linguists  and  the 
term  cosine  wave  comes  from  what  it  looks  like.    The  methods  mentioned 
in  the  preceding  paragraph  are  not  adequate  to  analyze  this  kind  of 
data  because  these  two  methods  require  constant  period  from  time  to 
time.    In  order  to  give  some  idea  about  how  to  model  this  kind  of 
data,  we  propose  a  new  time  series  model  which  allows  the  period  to  be 
different  from  cycle  to  cycle. 

1.2    Literature  Review 
Let  {x^,  t  >  1}  be  a  stationary  time  series  and      =  (a^,b^)  t  = 
1,2,  ...  be  a  given  sequence  of  intervals  in  the  real  line.  Also, 
let  the  coverage  probability  P|^  be  defined  as 

P,^  =  Prlx^  e        ;  t  =  1,2,.. .,k}  .  (1.2.1) 

The  computation  of  Pj^  seems  to  be  an  open  problem  since  the 
unsuccessful  attempt  by  Epstein  {1949,  1951).    It  is  related  to  the 
general  category  of  the  extreme  values  and  the  first  passage  time 
problems.    Surveys  of  recent  development  and  applications  of  extreme 
values  can  be  found  in  Gumbel  (1958),  Hann  (1976),  Galambos  (1978), 
and  David  (1970).    When  the  X^'s  are  independently  and  identically 
distributed  random  variables  with  a  known  distribution,  the  values  of 
P|j  can  be  easily  calculated,  and  when  X^'s  are  independent  and 
identically  distributed  with  an  unknown  distribution,  the  asymptotic 
property  of  P|^  evaluated  at  a^  =  -»  and  b^  =  b  for  t  =  1,2,  .  .  .  ,  k 
can  be  proved  to  belong  to  one  of  three  categories  of  distributions 
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for  one-sided  identical  intervals  Bj^  =      =  .  .  .  =  (-»,b),  or 
(a,»).    Although  a  similar  asymptotic  theory  can  be  derived  for  corre- 
lated      (see  e.g.,  Watson,  1954;  Berman,  1964,  1971;  Leadbetter, 
1974;  Leadbetter  and  his  colleagues,  1978,  1979),  little  is  known 
about  the  nonasymptotic  properties  of  P^^.    As  one  might  expect,  while 
the  properties  of  P|^  differ  very  little  asymptotically  between  inde- 
pendent and  correlated  processes,  they  may  be  quite  different  for 
small  k.    Hunter  (1977)  has  found  a  bound  for  P|^  under  the  first-order 
autoregressive  model  by  his  modified  Bonferroni  inequality  combined 
with  the  results  given  by  Gupta  (1963)  and  Slepian  (1962),  but  it 
seems  difficult  to  improve  his  bounds  further  if  the  error  of  their 
bounds  is  greater  than  the  required  accuracy.    The  first  passage  time 
problem  has  received  even  more  attention  in  the  literature.    A  survey 
of  this  field  can  be  found  in  Blake  and  Lindsey  (1973)  where  133 
references  are  cited. 

Coverage  probability  can  also  be  used  to  check  the  reliability  of 
using  prediction  intervals  for  time  series  model  validation.  Many 
authors  (e.g..  Box  and  Jenkins,  1976,  p.  137-138,  146;  Anderson,  1975, 
p.  122;  and  Nelson,  1973,  p.  153)  use  prediction  intervals  to  validate 
their  model  by  checking  whether  the  future  observations  fall  into 
these  intervals.    But  little  is  known  about  their  real  coverage  pro- 
babilities.   We  solved  this  problem  at  least  for  any  first-order  auto- 
regressive model  with  positive  autoregressive  coefficients.    Since  the 
prediction  intervals  are  in  general  two-sided  and  nonidentical ,  it  is 
not  known  whether  similar  bounds  can  be  easily  found  by  Hunter's 
method. 
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Wolfer's  yearly  sunspot  data  have  been  studied  by  many  authors. 
A  variety  of  models  which  may  be  based  on  different  time  periods  have 
been  suggested  since  1927.    A  summary  of  these  works  is  given  by 
Woodward  and  Gray  (1978),  where  nine  different  autoregressive-moving 
average  models  (ARMA)  can  be  found.    In  their  paper,  three  models 
suggested  by  them  are  compared  with  and  shown  superior  to  a  second 
order  autoregressive  model  which  was  suggested  by  Box  and  Jenkins 
(1976)  and  Moran  (1954).    Although  Woodward  and  Gray  claim  that  their 
models  are  better  than  other  ARMA  models,  they  admit  that  probably  the 
most  satisfactory  method  of  modeling  the  sunspot  series  would  be  one 
which  utilized  the  actual  physical  mechanisms  at  work.  Without 
knowing  what  the  actual  mechanisms  are,  we  tried  to  find  a  model  which 
provides  an  improved  fit  to  the  data.    A  model  which  is  a  product  of 
two  jittery  cosine  waves  is,  therefore,  proposed. 

Speech  wave-form  has  been  studied  by  many  linguists  and  acousti- 
cians.   A  vocal  jitter,  which  is  a  laryngeal  phenomenon  defined  as  the 
cycle-to-cycle  variation  found  within  successive  periods  of  a  laryngeal 
vibratory  pattern,  makes  the  speech  wave-form  fail  to  keep  the  fixed 
frequency  from  cycle-to-cycle.    Horii  (1975,  1979)  analyzed  this  kind  of 
data  by  using  a  peak-picking  method  which  is  similar  to  the  one 
described  by  Gold  (1962)  to  obtain  sample  frequencies  and  used  them  to 
estimate  the  mean,  median  and  standard  deviation  of  voice  fundamental 
frequencies.    Since  Gold's  peak-picking  method  did  not  use  any  modelling 
techniques  and  just  used  a  computer  program  to  select  the  maxima  and 
minima  of  the  wave-form,  it  would  be  desirable  to  incorporate  a 
statistical  model  in  analyzing  the  fundamental  frequencies. 
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1.3    Sketch  of  This  Dissertation 
In  Chapter  II,  a  computational  method  is  developed  to  compute 
of  a  first-order  autoregressive  process  to  any  required  accuracy.  Of 
course,  the  greater  the  accuracy  is  required,  the  more  computation  is 
necessary.    However,  the  computation  time  is  approximately  linear  in  k 
for  any  fixed  accuracy.    This  method  then  will  be  used  to  compute  the 
coverage  probability  of  the  prediction  errors  of  a  first-order  autore- 
gressive process.    One  hundred  and  twenty  coverage  probabilities  of 
the  various  numbers  of  prediction  errors  of  first-order  autoregressive 
process  with  12  different  autoregressive  coefficients  are  computed  by 
using  our  method  in  Chapter  III  as  an  example.    Since  the  true  cover- 
age probabilities  are  unknown,  we  also  conducted  a  Monte  Carlo  study 
in  Chapter  III  to  compare  with  the  computational  results.  (Some 
applications,  such  as  first  passage  time  distribution,  are  also 
discussed  in  this  chapter.) 

A  new  model  which  contains  a  random  frequency  shift  in  a  cosine 
function  for  a  stationary  time  series  is  introduced  and  studied  in 
Chapter  IV.    A  set  of  consistent  estimators  are  also  given  in  Chapter 
IV.    In  Chapter  V,  we  state  some  future  research  in  this  area  which 
includes  the  extension  of  this  model  to  a  more  general  case  and  the 
construction  of  a  more  complicated  model  to  fit  the  Wolfer's  yearly 
sunspot  data.    Our  model  is  shown  better  than  those  models  suggested 
by  Woodward  and  Gray  (1978)  because  it  can  fit  the  sample  autocorre- 
lations of  sunspot  data  much  better. 


CHAPTER  II 
COVERAGE  PROBABILITY  FOR  AR(1)  MODEL 


2.1    General  Asymptotic  Results 

Let  M   =   max  {x. }.    Then  the  coverage  probability  P   defined  in 
l<i<n  " 

(1.2.1)  evaluated  at      =  (-«,b)  for  a  real  number  b  and  for  t  =  1,2, 
.  .  .  ,n,  referred  to  as  P^^j^  from  now  on,  is  the  same  as  Pr(M^  <  b) 
which  is  an  extreme  value  distribution  function.    The  classical 
extreme  value  theory  which  deals  with  the  asymptotic  distribution  for 
the  maximum  of  independent  and  identically  distributed  (i.i.d.)  random 
variables  has  been  studied  for  at  least  half  a  century.    However,  some 
special  cases  may  reach  further  back  to  mathematical  antiquities.  For 
example,  as  early  as  in  1709,  Nicolous  Bernoulli  considered  the 
following  actuarial  problem:    Assume  n  men  of  equal  age  will  die 
within  t  years;  what  is  the  mean  duration  of  life  of  the  last 
survivor? 

The  classical  results  about  extreme  value  distribution  can  be 
found  in  many  articles.    For  example,  in  Galambos'  book  (1978),  it  is 
stated  as  follows: 

If  Xj^,  X2,  .  .  .  are  i.i.d.  random  variables,  and  if  for  some 
real  sequences  a^  >  0,  b„,  the  normalized  maximum  a^(Mn-b^)  has  a  non- 
degenerated  limiting  distribution  function  G(x),  then  G(x)  has  one  of 
the  following  forms: 
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TYPE  I:       G(x)  =  exp(-e"^)  -»  <  x  <  »  , 

TYPE  II:      G(x)  =0  x  <  0 

=  exp{-x  ")       (for  some  a  >  0)        x  >  0  , 
TYPE  III:    G(x)  =  exp(-(-x)")    (for  some  a  >  0)        x  <  0 
=  1  X  >  0  . 

It  is  known  that  if  Xp        .  .  .  are  independent  and  identically 
distributed  normal  random  variables  with  zero  mean  and  unit  variance, 
then  the  asymptotic  distribution  of      is  of  type  I,  i.e., 

P>^(an(M„-b^)  <  x)     exp(-e"'^)  ,  as  n  -»•  »  . 
where 

V2 

^n  =  (21og(n))  (2.i.i) 

\  -\ 
=  (21og  n)    -0.5(21og  n)       {log  log  n  +log(4Tr)}    .  (2.1.2) 

Berman  (1964)  extended  this  result  to  a  stationary  normal  sequence  and 
showed  that  under  certain  conditions,  the  extreme  value  of  a 
stationary  normal  sequence  has  the  same  limiting  distributions  as  in 
the  case  of  i.i.d.  random  variables.    His  results  are  given  in  the 
following  theorem. 

Theorem  2.1.1.    Let  {X^,  n=l,2....}  be  a  stationary  normal  sequence 
wi  th 
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EX^  =  0,  EX^  =  1  (2.1.3) 

for  all  n  and  with  an  autocovariance  sequence  {r^,  k=l,2,...} 
defined  by 

=  k=l,2,....  (2.1.4) 


If  {r^]  satisfies 


lim  rjog  n  =  0  (2.1.5) 
n-Ho 


or 


S        <  -  (2.1.6) 

n=l  " 


then      =  max{X^,X2,...,X^}  satisfies 

Pr(a^(M^-b^)  <  x)  ->■  exp(-e"^)  ,    as  n  -»■  »  , 

where  a^  and      are  defined  as  in  (2.1.1)  and  (2.1.2). 
This  theorem  will  be  used  later  to  find  the  asymptotic  distribution 
for  the  maximum  of  a  stationary  normal  ARMA(p,q)  process.  The 
definition  of  an  ARMA(p,q)  process  and  all  the  details  are  given  in 
the  next  section. 


2.2    Asymptotic  Result  for  Stationary  Normal  ARMA(p,q)  Process 
A  mixed  autoregressive-moving  average  (ARMA)  process  {X^}  is 
defined  as 
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X+  =  (t),X^.  X^.  „+a.-9,a^.  ,-...-9  a^.  „  (2.2.1) 

t      1  t-1         p  t-p    t   1  t-1         q  t-q 

where  the  (t)'s  and  9's  are  real  constants;  p,q  are  two  nonnegative 
integers  and  {a^}  are  i.i.d.  normal  random  variables.    Moreover,  the 
future  noise  a^   is  independent  of  the  past  X^  for  t  <  tg  for  all 
tg.    In  other  notation,  let  B  be  the  backward  shift  operator  which 
means  BX^  =  '^■^-l  ^* 

(l-<l>^B-<t)2B^-...-tpBP)X^  =  (1-9^8-828^-... -0qB^)a^ 

or 

*(B)X^  =  9(B)a^ 

where  "KB)  and  9(B)  are  polynomials  of  degree  p  and  q  in  B. 

An  even  simpler  notation  for  process  (2.2.1)  is  ARMA(p,q).  If 
the  equation  <ti(B)  =  0  has  all  its  roots  lying  outside  the  unit  circle, 
then  the  process  defined  in  (2.2.1)  is  a  stationary  process. 

Now,  let  the  X^'s  satisfy  condition  (2.1.3)  and  r^  be  defined  in 
(2.1.4).    Then  the  autocovariance  function  satisfies 

'"k  =  Vk-l^'-'-^Vk-p^^Xa^^^-Vxa^^-^^-'-'-Vxa'^-^^  ^^.2.2) 

where  rj(g(k)  is  the  cross  covariance  function  between  X  and  a  and  is 
defined  by 

r^^M  -  E(X,_,a,)  . 

By  the  independence  of  the  future  {aJ  and  the  past  {xJ.  equation 
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(2.2.2)  implies  the  difference  equation 

^  =  *l'"k-l'*2V2^----^p'"k-p  ^^^^1  (2.2.3) 

or 

♦  (B)r,^  =  0  k>q+l 

with  initial  conditions  {r,^,  k=l,2....,q}  which  depend  on  the  q  moving 
average  parameters  (91,02, ... ,9q)  ^"'^        P  autoregressive  parameters 
('i>l><t'2»««'.<l'p)-    The  general  solution  of  (2.2.3)  is 

^  =  ^;Pm,-l(")  -^^sPm  -l(") 

i  s 

where  <|)(B)  =  0  has  roots 

with  multiplicity  m^^, 

with  multiplicity  m^, 
m.m^+  . . .  +  m„  =  P  and 

i     c  S 

P^.  (n)  is  an  arbitrary  polynomial  of  order  i  in  n. 
It  can  be  shown  that  for  a  stationary  process,  G^-'s  satisfy 
IG^-I  <  1  i=l,2,...,p  . 


Therefore,  if  {x^,  n=l,2,...}  is  a  stationary  normal  ARMA(p,q)  process 
satisfying  condition  (2.1.3),  then  it  can  be  proved  that 
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Z  r 
k=l 


I  =    I  (G^P^    ,(n)  + 


Thus,  the  condition  (2.1.6)  of  Theorem  2.1.1  is  satisfied.  Hence,  we 
can  conclude  that 


where  a^  and      are  defined  as  in  (2.1.1)  and  (2.1.2).    The  coverage 
probability  Pp  of  a  stationary  normal  ARMA(p,q)  process  can  be 
approximated  through  (2.2.4)  when       =  (-»,b)  and  n  is  large.    But  can 
(2.2.4)  be  used  for  small  n?    To  answer  this  question,  a  Monte  Carlo 
study  for  one  of  the  simplest  models  described  in  (2.2.1),  by  letting 
p  =  1  and  q  =  0,  known  to  as  AR(1)  process,  is  conducted.    The  study 
is  done  by  generating  10,000  simulations  for  each  different 
combination  of  <\>i  and  n  values.    In  each  simulation,  a  set  of  n 
normally  distributed  observations,  Xp        .  .  .  ,  X^,,  which  satisfies 
condition  (2.1.3)  is  randomly  generated  through  AR(1)  model  with 
autoregressive  parameter  <t>i.    Then  the  coverage  probability  P^,  ^ 
is  estimated  by  C/10,000  where  C  is  the  number  of  simulations  of  which 
the  whole  set  of  observation  values  falls  into  the  95%  one-sided 
confidence  interval;  that  is,  X^  <  1.645,  t=l,2,...,n.    The  result  is 
shown  in  Table  I.    From  Table  I,  we  can  see  that  when  n  is  small,  the 
coverage  probabilities  of  the  AR(1)  process  when  the  autoregressive 


Pr(a^(M^-b^)  <  x)  >  exp(-e"^)    as  n  > 


(2.2.4) 
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parameter  is  not  close  to  zero  are  quite  different  from  those 
the  i.i.d.  case.    For  example,  when  k  =  10 

P^O, 1.645  "  0-5988  at      =  0  and 

^0,1. 645  =  h  =  • 

We  can  claim  that  their  difference  is  statistically  significant 
because  two  standard  deviations  of  the  simulation  error  based  on 
10,000  independent  Bernoulli  trials  is  less  than  0.01. 


Table  I 

''n, 1.645         10»000  Simulations 
n 


4 

5 

8 

10 

0 

0.8145 

0.7738 

0.6634 

0.5988 

0.02 

0.8120 

0.7716 

0.6649 

0.6009 

0.1 

0.8186 

0.7751 

0.6646 

0.6144 

0.2 

0.8174 

0.7769 

0.6707 

0.6139 

0.5 

0.8271 

0.7961 

0.7080 

0.6671 

0.9 

0.8158 

0.8017 

0.7876 

0.7719 
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2.3    A  Computational  Method  for  Coverage  Probability 

for  AR(1)  MoHeT   

Since  the  result  at  the  end  of  the  previous  section  shows  that 
when  n  is  not  large,  (2.2.4)  cannot  be  applied  to  compute  P^,  ^, 
methods  other  than  (2.2.4)  need  to  be  explored  when  n  is  not  large. 
Hunter  (1977)  has  found  a  bound  for  P^^i^  for  an  AR(1)  normal  process 
by  his  modified  Bonferroni  inequality  combined  with  the  results  by 
Gupta  (1963)  and  Slepian  (1962),  but  it  seems  difficult  to  improve  his 
bounds  further  if  the  width  of  their  bounds  is  greater  than  the 
required  accuracy.    Moreover,  the  prediction  intervals  are  in  general 
two-sided  and  nonidentical .    It  is  not  clear  whether  similar  bounds 
can  be  easily  found  by  Hunter's  method.    In  the  rest  of  this  section, 
we  derive  a  computational  method  that  can  compute  P|^  for  any  required 
accuracy  for  an  AR(1)  normal  process  or  a  conditional  AR(1)  normal 
process  (which  will  be  defined  later)  for  both  one-sided  and  two-sided 
nonidentical  intervals.    Of  course,  the  greater  the  accuracy  is 
required,  the  more  computation  is  necessary. 

We  first  define  the  conditional  AR(1)  normal  process  as  follows: 
Let      be  a  Markov  process  satisfying  Xq  =  0  and 

^t  "  *^t-l'^^t  •        t=l,2,...  (2.3.1) 

where  \^\  <  1  and  {e^}  is  a  sequence  of  i.i.d.  normal  random  variables 
with  zero  mean  and  unit  variance.    Also  let      =  (a^,b^),  t=l,2,..., 
be  a  given  sequence  of  intervals  in  the  real  line.    In  this  section. 


P,^  =  Pr(X^  e  B^  ;  t=l,2,....k) 
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is  evaluated  for  small  to  moderate  k  when  <()  is  a  rational  number  of 
the  form 

<l>  =  s/r 

where  s,r  ^  0  are  integers  and  s/r  is  irreducible. 

Let  a  new  discrete  process  X^(n)  and  e^(n)  be  defined  as 

X^(n)  =  *X^_^(n)  +  e^(n)  (2.3.2) 

where  Xgin)  =  0,  and  {e|.(n)}  is  an  i.i.d.  discrete  process  with 

Pr{e^(n)  =  jr""}  =  *[( j+l/2)r""]  -  $[( j-l/2)r""].  j=0.±l..... 

where  *(x)  is  the  standard  normal  distribution  function. 

Equation  (2.3.2)  is  the  result  of  approximating      by  a  discrete 
random  variable  e^(n)  which  condenses  all  the  probability  in 
C(j-l/2)r"",  (j+l/2)r~"]  to  jr"".    To  use  this  approximation,  one 
needs  to  control  the  error 

^  =  ^''^h^^t'  t=1.2....,k}  -  Pr{X^(n)eB^;  t=1.2....,k}  .  (2.3.3) 

and  to  find  a  feasible  algorithm  to  compute  Pr{X^(n)eB^, 
t=l,2,...,k}.    The  second  problem  is  dealt  with  first.    The  apparent 
difficulty  in  finding  Pr{x^MeQ^;  t=1.2....,k}  lies  in  the  fact  that 
X^(n)  takes  more  and  more  values  for      of  the  same  length.  For 
example,  X^(n)  takes  only  values  of  the  form  3r~^H^  for  integer  j. 
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while  X2(n),  according  to  (2.3.2)  takes  values  of  the  form  jr~^""^^^eB2 
for  integer  j.    Consequently,  X|^(n)  takes  values  of  the  form 
j^-(n+k-l)gg^  for  integer  j.    Even  for  moderate  k  (e.g.,  k=10),  and 
small  n,  r  (e.g.,  n=4,  r=5),  the  storage  for  the  values  of  X|^(n)  in 
[0,1]  is  of  the  order  5^-^  =  1.22  x  10^  which  cannot  be  handled  by  most 
computers.    Because  of  the  large  number  of  values  for  X|^(n),  the  time 
required  to  compute  them  is  even  worse.    Let  P^-(n,j)  be  the 
probability  that  X^-(n)  =  jr"^""*"^"!^    Then  a  recursive  formula  from 
P-|_l(n,j)  to  P,-(n,j)  can  be  written  as 


P,(n.j)  =    ^,Pr[=,(n)=jr-'"*'-ll-*j'r-'"*'-2']p..j(„.j'),  (2.3. 


4) 


I  I 

Where  the  range  of  j    is  over  all  the  integer  j    such  that 
j'r-("^i-2)eB._^  and  Pr{e.(n)  =  jr-^^^^'-l^-^jV^^^^-^)}  =  o  if 
(n+i  l)_^j  g  (n+i  2)  ^.^  ^^^^  ^^-n        ^^^^  integer  i. 

The  total  additions  and  multiplications  for  computing  P|^  by  (2.3.4) 
behave  like  a  k-fold  multiple  integral,  i.e.,  with  total  computational 
operations  0{r^^).    This  is  obviously  unmanageable  even  for  small  k. 
The  following  theorem  will  enable  one  to  reduce  the  storage  space  as 
well  as  the  computing  time  to  0(k)  for  fixed  n  and  r. 

Theorem  2.3.1.    Let  {x^}    denote  the  time  series  defined  in  (2.3.1). 
Then 

|Pr{x^eB^^,  t=l,2,....k}  -  Prlx^BB^^,  t=1.2, . . . ,k} |  <  Ar"^2n+k-l)^ 

(2.3.5) 

where 
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r 

^It  "  ^2t  ^  t=l,2,...,k-l,  and 

B^^  =  [j/r"-  l/(2r"^^-^).j/r"+l/(2r"^^-l)]  . 

=  [j/r"-(m+V2)/r""^-l  , j/r"-(m- V2  )/r"^^-^]  . 
for  any  j=0,±l,±2, . . . ,  and  any  0  <  m  <  r^'^  . 

^roof:    Let  the  density  function  of  X  =  (X^,X2. . . . ,X,^) '  be  denoted  by 

f(X)  =  (2Tr)-'^/2,2|    ^%xp(-x'f ^X/2)  . 
where  Z  is  the  variance-covariance  matrix  of  X,  i.e., 

1  *  ...  ^^'^ 

*      iH^     *(i+<i)^)     ...  ^^'^nnh 

...  i.<^2,...,^2(k-l)  ^  ^ 

Then  it  is  known  by  Theorems  8.3.6-7  of  Graybill  (1969),  that  the  last 
column  and  row  of        can  be  expressed  by 
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1  if  i=k 

a     -a     =  {  -s/r  if  i=k-l  (2.3.6) 

0  otherwise, 

where  a^"^  denotes  the  i.jth  entry  of  i'^.  Therefore, 
|Pr{X^£B^.  t=l,2,....k}  -  Pr{X^eB2^,  t=l,2, . . . ,k} | 

=  l/B^^9(Xk)dx^-/g^^g(x^)dxJ 

<  max|g(u)-g(v)|r"^"^'^"l)  ,  (2.3.7) 

"^^Ik 
^^^2k 

where  g(x^)  =  /g  .../^     f (x)dx^_^dx^_2.. .dx^.  ^^9^^ 
side  (2.3.7)  can  be  simplified  by  noting  that 

n)ax|g(u)-g(v)  I  <   max    |g'(w  )jr~"  ,  where 


Iq  =  Cjr-"-(V2)r-^"^^-l).jr-Mm.V2)r-("^^-i)]. 


'3*^^0^l  =  'i|:^B/--/b,  /(x)dx^dX2...dx^_^| 


and 
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=  |/„  .../n      {-V2f(x)(  I  xAa^Ka^'*)+2x.a^^)(lx,...(ix.   J  . 
1       "^k-l  -    i=l  ^  k  1  k-1' 

(2.3.8) 

Substituting  (2.3.6)  to  (2.3.8),  one  has 
|g'(wQ)|<(s/r)/  .../  |x^_jf(x)dx^...dx^.^+|xj/  .../  f(x)dx,...dx.  , 

-OB  ~  -00  -OO        ~  •'■ 

oo 

=  (s/r)  /   f(x,^.pX,^)|X|^_Jdx^_^+|xJf{x,  ) 

"CO 

and  /  n\.y\)\\.^\dX^_^ 

"CO 

£  fix,)  /"  f(X,.JX,=x,)|X,.j-E(X,.j|X,)|dX,.^  * 
f(X,^)  /"  f(X^_^|X^)|E(Xj^_^|X^)|dX^.j 

"oo 

=  f(X^).   1+  f(X|^)|X|^|  cov(X^.X^_^)/var(X^) 

?  1/2 
<     f /(2n.var(X^))     +  f  (X^)  iXj  f  .(l-(i.)2('<-l)  j/(i.(lj2kj  ^ 

-1/2 

max     f(X.)|X.  I  =  (Z^^e) 
-°°<X|^<" 
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we  have 

|g'(w^)|  •     (l-(f)2)/(l-{f)2^)  + 

-  V2 

7  -(2^6)    •(i-(f)^^^'^^/(i-(f)2'')]  +  (2^6 r2  . 

This  completes  the  proof  of  Theorem  2.3.1. 

To  see  how  Theorem  2.3.1  can  be  used  to  reduce  storage  and 
computation,  we  let      =       .  .  .  =  B|^  =  [0,1]  to  simplify  the 
illustration.    Without  Theorem  2.3.1,  one  needs  all  the  probability 
values  for  X|^(n)  at  points 

in  --(n+k-1)  -  -(n+k-1)  .  -(n+k-1)  ,1  . 
lO'i^  .2r  .....jr  ,...,1}  (2.3.9) 

to  approximate  P,^  by  (2.3.4).  But  with  Theorem  2.3.1,  one  needs  only 
to  compute  X^^in)  at 

{0.r'",2r'",....jr"",....l}  (2.3.10) 

and  approximate  all  the  rest  of  the  probabilities  at  points  in  (2.3.9) 
by  the  probabilities  at  points  in  (2.3.10).    To  compute  the 
probabilities  of  points  in  (2.3.10)  by  (2.3.4),  note  that  in  (2.3.4) 
e|^(n)  takes  only  values  of  the  form  ir""  with  integer  i.    Thus,  for 
the  point  Jr""  in  (2.3.10),  one  has,  by  (2.3.4), 


1r-"  =  jr-"  -  (s/r)a'r-("^^-2)  , 

or 


(2.3.11) 
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j'  =  (j-i)r'^"Vs  . 

Because  s  and  r  are  relative  primes,  (j-i)/s  has  to  be  an  integer.  In 
other  words,  j  r'^"^"^)  has  to  be  an  integer.    Thus,  for  all 
j  r"^"'*''^"2^eB|^_-,^,  one  only  needs  to  consider  the  values 
{0,r  "'''^,2r  ""*'^,...,jr  "'*'^,...,l}eB|^_-|^.    it  can  be  shown  by  the  same 
argument  that  in  order  to  compute  all  the  probabilities  at  the  points 
in  (2.3.10),  one  needs  only  to  evaluate  the  probabilities  of  all  the 
subsequent  X,^_2(n),X|^.3(n)....,X2(n),Xi(n)  at 
{0,r~"+^2r"""^^...,jr""'^^...,l}.    Hence,  the  total  number  of 
operations  is  of  order  0(kr2""^)  and  the  storage  space  is  0{r").  For 
fixed  n,  the  computational  time  is  linear  in  k. 

Next,  the  error  defined  by  (2.3.3)  will  be  evaluated. 

Theorem  2.3.2.    Let  X^  and  X^(n)  be  defined  by  (2.3.1)  and  (2.3.2), 
respectively.    Then,  the  error  e,^  defined  by  (2.3.3)  satisfies 


|e^|  <  k(k+l)eQ/2 


where      =  Utt)'^^  {r^)''^  =  o.4r 

Before  proving  this  theorem,  we  need  to  prove  a  lemma.    In  this 
lemma,  [x]  will  denote  the  closest  integer  to  x,  e.g.,  [1.4]  =  l, 
[1.6]  i  2,  [1.5]  =  1. 

Lemma  2.3.3.    Let  {p.}  be  a  sequence  of  nonnegative  numbers  such  that 
^    p.<  1.    Let     max    |p  |  <  e„,  and  for  a  given  integer  k, 

J=-oo      J  -«.<j<<x,        J  U  "  3  » 


-n 
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c    _         2  z  z 


where  a^. ,  b^.,  al,  bl  may  depend  on  the  previous  indices 

^i-l'^i-2'***'^l'  l^i'^i-l  i  l.|b^.-bj|  <  1,  and  a^.  >  if  b^. 
for  all  i=l,2.....k.  then  |e^|  =  |S^-S^|  <  keg. 

The  proof  is  obvious  by  induction  and  by  the  general  formula 

^      S  ...     E      p.  ...p.     -    Z      Z  ...     Z    p.  ...p. 

bl    b2      b^  b-       b^  b^  b^ 

=    z    [z  ...  Z  p.  ...p    -  Z  ...  Z  p.  ...p    ]p  +  xz  ...  z  p.  .. 

Jr^l  ^2      \  ^2       \   a^      a^  h      h    h    a'       a^  ^2 

where  iX|  <  "'ax(pj^^.p^^.p^.  .p^. .  |p^^-p^.  | .  jp^. -p^  j  )  <  e^. 

Proof  of  Theorem  2.3.2. 

In  this  proof,  all  the  j^-'s  are  r""  integer;  i.e.,  they  are 
the  form  jr""  for  an  integer  j.    Let  [x]  denote  the  number  jr"" 
closest  to  X  and  A  =  r""/2.    By  the  transformation      =  ^X^.^+e^. 
t=l,2,...,k,  we  have 

Pr{X^eB^,  t=l,2....,k} 

6i    82  h 
a,    a2  a,  1 
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where      =  a^^,      -  b^^,  02  =  3i2''^^i*  ^2  "  "^a"*^!*  ^'"^  S^"^""*^ 


Define 


where 


I 

Em 


^Jl  "  ^£"*^£-r  •••  "  •  il=1.2,...,k.  (2.3. 


°k-i+l  °k 


^k-i+r\-i+l  Jk"\ 


with 


if  m  <  k-i 


if  m  >  k-i 


Then 

'«k~\'  -        »  'S|(-B|^|  <  a/2  ,  and 


24 


Ij-Sj  =  /     d*{e^)  -     Z     Pr{e^(n)  =  j^) 


8,  B, 

k  °k  1 


=  /      d$(e  )  -  /     iHe  )  <  (A/2  +  A/2)  •        T  •  r"" 
"k  \  '  ■ 


Let  1  |I,--S.  11=         sup  |I.-S.  I  and 

EE  £ 

1'  2"*"  k-1 


^  '^k-i+2  ^k-i+3 

1-1      .         _.'  .  ' 

^k-i+2  ^k-i+2  ^k-i+3^\-i+3 


I,  Pr(e^(n)=j^)...Pr(E^_.^2(n)=j^_.^2)  . 


Jk=\ 


where 


I 


Note  that  A£  and       have  a  difference  at  most  l{r""  integer)  and  i 

I  I 
Ajt  if  B£  >  Bji  and 


b+A  b 
J       h([e],...)  d*(E)  =    E    h(i....)Pr(e(n)  =  i), 
a-A  i-=a 
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for  any  function  h  and  r""  integers  a  and  b.    Thus,  for  the  general 
term 

Vi+l 

+  /  ^--i^^^Vi+l^  "^-^  ^"^  (2.3.13) 

"'k-i+l 

its  second  right  term  satisfies  Lemma  2.3.3  and  produces  |  |S^.  ^-S*  ^|| 
<  (i-De-.    The  last  term  of  (2.3.13)  can  be  expressed  as 


^k-i+1   ^  B^-i+1 


^--i^^^Vi.i) -/  Cid*(Vi.i) 


"k-i+l  \-i+l 


and  by  checking  the  integration  boundaries  it  can  be  shown  that 
|6|  <  eQ.    Thus,  (2.3.13)  becomes 


li-Sill  <  IIIi-rS,._ill  -i.e^  , 


or 


\\-\\  <  k(k+l)eQ/2  . 

The  theorem  is  proven. 

An  error  bound  linear  in  k  can  be  obtained  if  B^'s  are  one-sided 
and  ^  >  0. 
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Theorem  2.3.4.    Let      in  Theorem  2.3.2  be  (-",b^)  for  t=l,2,...,k. 
Then  for  ^  >  0,  |e^|  <  {1.5k-l)eQ. 

First,  a  lemma  will  be  proven.  In  this  lemma  [x]  again  denotes 
the  closest  integer  to  x. 

Lemma  2.3.5.    Let  F  and  g  be  two  monotonic  functions  from  (-","»)  to 

[0,1].    Let  F  also  be  continuous  and  satisfy  max  |F(x+l)  -  F(x)|  <  e 

X  ~  ^ 

for  some  constant  e^. 


Then 

,  ,a  [a-1/2] 

g(t)dF(t)  -       i:_    g{j){F(j+V2)  -  F(j-1/2)|  <  1.5  e^  .  (2.3.14) 

J 


:_oo  -  -  -Q 


Proof.    First,  we  write  the  integral  as 
,a  Ca-1/2]  j+1/2  a 

J    g(t)dF(t)  =     Z     /        g(t)dF(t)  +    /  g{t)dF{t).  (2.3.15) 

j='"   j-V2  [a-l/2]+V2 

Obviously, 
J+  V2 

/  .     g(t)dF(t)  =  m.{F(j+V2)  -  F(j-V2)} 
j-V2 

for  some  mje[g( j- V2  ) ,g( j+ V2  )].    By  the  monotonicity  of  g, 
rJ+V2 

1/  ,     g(t)dF(t)  -  g(j){F(j+l/2)  -  F(j-1/2)}| 
J- V2 


<  lg(j+V2)  -  g(j-V2)|e,  . 


0  •  (2.3.16) 
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Combining  (2.3.15),  (2.3.16).  and  the  fact  0  <  g  i  1,  we  have 
(2.3.14). 

Proof  of  Theorem  2.3.4.    As  in  the  proof  of  Theorem  2.3.2,  all  the 

j^-'s  are  of  the  form  jr""  for  integer  j,  and  [x]  denotes  the  number 

jr""  closest  to  x.  Moreover,  by  the  transformation  =  *X^.;^+e^,  we 
have 

Pr{X^eB^,  t=l,2,...,k} 

bi  b2-*e^  V*%-r--*'''^i 

=  /      /  /  d«l'(e.  )  ...  d*(ej  . 

Approximating  the  last  integral  by 


Pr(e^(n)=j^)  ,  (2.3.17) 

k 


J.  =-» 


one  can  see  that  the  difference  of  (2.3.17)  and  the  last  integral  is 
less  than  0.5  Cq  in  absolute  value.    Let  the  sum  in  (2.3.17)  be 
g(e|(_l).    Since  <l>  >  0,  g(ek-i)  is  a  monotonical ly  decreasing  function 
satisfying  the  requirement  of  g(x)  in  Lemma  2.3.5.    Thus,  if  we 
approximate 


L  Pr(^,(n)=j,)d*(e^.^) 
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by 

'•^k-l~*^k-2~' •        ^e^-'^'^k'^^k-l"' •  ^^l-' 
''k-l  Jfc 

(2.3.18) 

one  can  see  from  Lemma  2.3.5  that  the  absolute  error  is  less  than  1.5 
Cq.    Similarly,  (2.3.18)  is  a  monotonic  function  of  e|^_2  and  the  third 
integral  can  again  be  approximated  by  a  summation  with  an  absolute 
error  less  than  1.5  eg.    Thus,  by  continuing  this  process,  one  has  the 
total  error  less  than  (l.Sk-Deg  in  absolute  value.    The  theorem  is 
proven. 

When  the  three  theorems  are  combined,  the  absolute  error  between 

and  (2.3.4)  is  bounded  by  k(k+l)eQ/2  +  Ar"^^""^'^"-^^b  -a  )r^""^'^"^^ 

=  k(k+l)eQ/2  +  A(b^-a^)r""  for  two-sided  bounds  B^.    Although  this 

error  is  quadratic  in  k,  it  is  much  better  than  the  usual  exponential 

error  propagation  in  multiple  integrals.    As  shall  be  shown  in  the 

next  chapter,  the  actual  error  can  be  much  smaller  than  this  upper 

bound.    For  one-sided  B  ,  the  error  bound  becomes  (1.5k-l)e    +  AB  r"" 
-  *  Ok' 

where      =  (b^-aj^)  and  a*  is  a  lower  truncation  point  to  (-«,b^)  such 

that  P^{X^  <^a  }  is  negligible.    Similarly,  if  the  condition      =  0  in 

(2.3.1)  is  replaced  by  the  stationary  condition      ~  H{0,{l-<^^)~^) , 

then  z,  the  variance-covariance  matrix  of  X  becomes 
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-<j, 
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-<!> 
1 


k-1 

Thus,  the  term  (  ^    XA^'^^+o'^'' )+2xy^)  in  (2.3.8)  is 
i  =1    1  K 

equal  to  -2<('X|^_j+2Xj^  which  is  the  same  as  in  the  conditional 
case.    This  implies  that  Theorem  2.3.1  remains  valid  in  the  stationary 
case  and  since  the  proofs  of  both  Theorem  2.3.2  and  2.3.4  are 
independent  of  the  stationarity  of  X  and  the  distribution  function  of 
X^,  these  two  theorems  still  hold  even  though  the  condition  Xg  =  0  in 
(2.3.1)  is  removed.    Hence,  the  method  proposed  in  this  section  can 
still  be  used  to  find  the  coverage  probability  for  model  (2.3.1)  when 
the  condition  X^  =  0  is  replaced  by  the  stationary  condition  X^  ~ 
N(0,(l-<|.2)-1). 


CHAPTER  III 

THE  COVERAGE  PROBABILITY  OF  PREDICTION  INTERVALS 
IN  AR(1)  MODEL 


3.1    Computation  and  Simulation  Results 
In  this  section,  we  first  compute  the  coverage  probability  of  95% 
prediction  intervals  for  a  first  order  autoregressive  process  given  Xq 
=  0.    Using  the  standard  prediction  method  in  time  series  [e.g..  Box 
and  Jenkins  (1970,  Chapter  5)]  gives  B^-  =  (-1.96a. ,l. 96a ^. )  where  a?  = 
(l+'t>^+...+<t)^^^"^^).    The  values  of  p|^  computed  by  the  algorithm 
proposed  in  the  last  section  of  the  previous  chapter  from  various  <^  = 
r/s,  n,  and  k  are  given  in  Table  II.    The  values  in  the  parentheses 
are  the  same  p|^  estimated  by  10,000  simulations  for  each  case.  The 
simulation  was  performed  on  an  IBM  4341  computer  with  its  normal 
random  number  generator. 

From  Table  II,  one  can  see  that  the  coverage  probabilities  for 
various  <^'s  do  not  differ  very  much  for  small  k,  but  the  gaps  are 
widened  when  k  becomes  large.    Thus,  caution  should  be  employed  when 
these  prediction  intervals  are  used  for  model  validation. 

To  evaluate  the  error  bound,  we  take  the  case  <^  =  0.5  for 
example.    According  to  Theorems  2.3.1  and  2.3.2,  error  bound  for  k  = 
10, n  =  11  is 


|e^|  <  55eQ  +  [(s/irr)  +  0.302]B^r 
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where  s=l.  r=2,  B,^  =  2xl.96x(l+*^+. . .+(t)^^)    ,  \e^\  <  QAr'^^,  or 
kj^l  <  0.011.    The  simulation  error,  when  evaluated  by  two  standard 
deviations  of  the  estimate  from  10,000  independent  Bernoulli  trials, 
is  less  than  0.01.    Thus,  it  is  apparent  that  the  real  error  e|^ 
propagates  much  slower  than  0{k^)  as  stated  in  Theorem  2.3.2.  Since 
most  of  the  coverage  probabilities  for  k  <  10  by  our  algorithm  and  the 
simulation  differ  much  less  than  0.01,  it  seems  that  the  error  bounds 
are  overestimated  and  the  actual  errors  by  our  algorithm  are  actually 
very  small.    Unfortunately,  we  were  unable  to  improve  our  error 
bounds. 


3.2    Appl i cations 
Table  II  can  also  be  used  to  investigate  the  change  of  the 
coverage  probability  when  the  observation  gap  is  changed.    It  is  well 
known  that  if      is  a  AR(1)  process,  the      =        is  again  an  AR(1) 
process  with  Y^-<|)2y^_^  =  ^2t'^*^2t-l-         would  expect  that  for  a 
highly  correlated  process.  Cy(k)  =  P^i'f^eBl^,  t=l,2,...,k}  can  be  used 
as  an  approximation  to  C^lk)  =  P^ih^hv  t=l,2, . . . .k},  where  are 
the  95%  prediction  intervals  for  Y^.    This  comparison  would  be  useful 
in  quality  control  when  the  qualities  of  items  are  highly  correlated 
in  a  product  line.    Since  the  coverage  probability  of  prediction 
intervals  at  a  fixed  confidence  level  is  independent  of  the  variance 
of  the  white  noise  process  in  an  AR(1)  model,  several  C  (k)  and  C  (k) 

y  X 

can  be  compared  by  using  Table  II.    For  example,  for  ^  =  0.9,  = 
0.8,  Cy(lO)  =  0.73  and  C^(IO)  =  0.65,  and  for  .j,  =  0.7,       =  0.5, 
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CydO)  =  0.64  and  C^(IO)  =  0.50.  These  gaps  seem  quite  substantial 
even  for  <),  =  0.9. 

The  present  method  can  also  be  used  to  find  the  first  passage 
time  distribution  for  model  (2.3.1).    Let  P|^  and      be  defined  as 
before  and  let 

T  =  min  {  t:       £  B^}  . 

Then  the  distribution  of  T  can  be  obtained  by 

Figure  3.1  shows  the  values  of  Pr{T  =  k}  for  ^  =  0,  0.5,  0.75, 
and  0.9  with  B^  =  (-1.96a^,1.96a^) .    It  is  interesting,  but  not 
surprising,  to  see  that  the  first  passage  time  distribution  changes 
its  shape  from  an  independent  to  a  dependent  process. 
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a.  12- 
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Figure  3.1  First  passage  time  distribution  for  an  AR(1)  model.  The 
*  values  for  curves  1,  2,  3,  and  4  are  respectively  0.0, 
0.5,  0.75,  and  0.9. 


CHAPTER  IV 
MODEL  BUILDING  FOR  JITTERY  COSINE  WAVE 


4.1  Introduction 

Many  time  series  exhibit  a  variation  which  is  in  some  way 
periodic.    Some  of  them  have  variations  at  a  fixed  period  due  to 
seasonal  effect  or  due  to  some  other  physical  cause.    For  example,  the 
weekly  minimum  temperature  at  New  Orleans,  Louisiana,  from  1980  to 
1983,  whose  plot  is  in  Fig.  4.1  presents  a  variation  which  is  annual 
in  period.    Some  others  exhibit  oscillations  which  do  not  have  a  fixed 
period.    The  Wolfer's  yearly  sunspot  data  from  1749  to  1924  are  good 
examples  of  this  case.    The  plot  in  Fig.  4.2  shows  that  the  numbers  of 
years  between  two  local  minimum  are  11,  9,  9,  14,  12,  13,  10,  ...  , 
and  these  are  quite  different  from  each  other. 

In  analyzing  data  with  variation  at  a  fixed  period,  taking  dif- 
ference between  data  to  remove  the  periodicity  is  one  of  the  most  com- 
mon methods  (e.g..  Box  and  Jenkins,  1976,  Chapter  9).    Decomposing  the 
data  into  a  sum  of  deterministic  periodical  functions  and  a  random 
component  is  another  one  (e.g.,  Anderson,  1971,  Chapter  4).    Since  the 
first  method  implicitly  requires  a  fixed  period  and  the  second  one 
also  explicitly  shows  that  the  fixed  period  is  necessary,  both  of  them 
are  inadequate  in  analyzing  cyclical  data  without  a  fixed  period. 
Some  authors  (e.g.,  Moran.  1954;  Box  and  Jenkins,  1976;  Woodward  and 
Gray.  1978)  have  used  the  ARMA  model  to  fit  the  sunspot  data.    This  is 
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Figure  4.1    Weekly  minimum  temperature  at  New  Orleans,  Louisana. 
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Figure  4.2    Wolfer's  yearly  sunspot  data. 
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obviously  inadequate  because  they  did  not  incorporate  the  existence  of 
periodicity  in  their  model. 

In  this  chapter,  we  start  with  a  simple  stationary  time  series 
model  which  is  basically  a  cosine  function  with  jitteriness  in 
frequency  w  from  time  to  time.    Formal  definition  of  this  model 
follows. 


4.2    Description  of  the  Model  and  Its  Stationaritv 
Let      =  ao  cos(a)Qt  +  9^)    t  =  0.1.2,  .  .  .  (4.2.1) 

where 

a^  >  0  and  toQe(o,2Tr)  are  real  constants. 


8^.  =    ^  e.  with 
^     i=0  ^ 

Eq  ~  U(0,2Tr)  and 

1  >  1}    are  i.i.d.  normal  random  variables 

with  zero  mean  and  variance  <J^, 

and  Eq  is  independent  of  {e^. ,  i  >  l}  . 

Since  cos(a.ot+et)  =  cos{{<^Qt+B^)mo(l{Zir)) ,  without  loss  of  generality, 
we  can  study  the  distribution  of  (tjQt+e^)mod{2Tr) . 
To  do  so,  we  first  give  the  following  lemma. 

Lemma  4.2.1       If  Z  ~  U(0,Zi^),  then  for  any  random  variables  X  and 
integer  i,  (iZ+X)mod(2Tr)  ~  U(0,27r). 
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Proof:     It  is  trivial  that  (iZ)mod(2Ti)  ~  \}{0,2^)  and  (iZ+x)mod(2ii)  ~ 
U(0,2Tr),  where  x  is  a  constant. 
Let  Y  =  (iZ+X)mod(2Tr),  then 
Pr{Y<a)  =  Pr{{iZ+X)mod(2^)<a) 

=  1^  Pr((iZ+x)mod(2,r)<a)dF(x) 
=  Pr({Z)mod(2Tr)<a)  dF(x) 
=  Pr((Z)mod(2Tr)<a). 
From  Lemma  4.2.1,  we  have 

(a)ot+e^)mod(2,r)  ~  U(0,2n)  t=l,2,...  . 

Moreover, 

((Ooi+ej)niod(2Ti)  ~  U{0,2tt)       i.j=1.2....    .  (4.2.2) 

j 

With  this  result  letting  a,-.  =  9.  -  9.  =     j         .  we  can  derive 

some  basic  properties  of  |X^}  defined  in  (4.2.1)  as  follows. 
Throughout  the  rest  of  this  chapter,  let  Z  denote  a  random  variable 
uniformly  distributed  on  (0,2,^). 
Stationarity 

Theorem  4.2.2   The  process  {X^}  defined  in  (4.2.1)  is  a  stationary 
time  series. 
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Proof:    It  suffices  to  show  that 

Pr(X.  <a^,  X        <^2'""h.n    ^  Vl^ 
i  m 

■I-  m 

where  i,  n^^  <  02  <  .  .  .  <  and  m  are  positive  integers 
and  dij^,  i^,  ....  di^^^  are  real  constants. 

Since 

P'-U.  <a^.  X.       <a2.....  X.^^    <  a^^^) 
1  m 

=  Pr{aQCos(ia.Q+e.)  <  a^,aQCos((i+n^)a,Q+e.+A.^.^^^)  <  g,^, 
....  a„cos((i+n„)(D  +e.+A.        )  <  a  ,} 

ID 

=  Pr{aQCOs((iuQ+e^.  )mod(2Ti))  <  a^, 

aQCOs((iu)Q+6.)niod(27r)+n^a)Q+A.^.^^  )  <  a^2' 

....  ap,cos((ia)„+e. )inod(2Tr)+n  oj.+A.  .     )  <a  ,} 
u  0    1  m  0    1.1+nj^   -  m+1^ 

=  Pr{aQCos(Z)  <  a^.  aQCOs(Z+n^a,Q+A. ^  .^^^)  <  i^,,,, 

anCos(Z+n  w.+A.  .     )  <  a    J  . 

m 

Similarly, 

Pr{X.     <a,.X.  <  a^,...,  X.^^^  <a 

1+t  -    1     i+t+n,  -    2'  1+t+n  - 

i  m 
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=  Pr{aQCOs(Z)  <  a^.  aQCos(Z+n^a)Q+A.^^^  .^^^^^)  <  a^. 

....  aoCos(Z+n^a,o+A.^^^.^^^^  )  <  a^^^} 

m 

=  Pr(aQCos(Z)  <  a^,  ao*^°^^^"^"l''o'*'^i  .i+n^^  -  ^2' 

a„cos(Z+n  (d„+a.  .      )  <  a    ,}  . 
0  m  0    i,i+n|^'  - 

Hence,  the  process  {X^}  defined  in  (4.2.1)  is  a  stationary  process. 
Moments  and  Autocorrelation  Function 
The  mean  is 

EX^  =  EagCosdijQt+e^)  =  EaQCos(Z)  =  0        t=l,2....  . 
The  autocovariance  function  is 

r(K)  =  E{X^X^^^}  -  (EX^)2 

=  E{aQC0s(a)Qt+9^)  aQCos(a)Q(t+K)+e^^|^}  -  0 

=  a^E{cos(Z)+cos(Ka)Q+A^^^^^)}/2 

=  (a2cos(a.QK)e"^^  'hll  K=0,1.2....  . 

Then, 

var(X^)  =  r(0)  =  a^/2  . 
So,  the  autocorrelation  function  is 
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2 

p{K)  =  (EX^X^^^)/var(X^)  =  cos(a,QK)e"'^^ 


The  expectation  of  the  third  order  cross  product  is 


^hhnhM,  =  a^E{cos(a.ot+9^)cos(a,o(t+£)+e^^^) 


C0S(ajQ(t-Hn)+9^^)} 


a3E{cos{3tn+n,)u.Q^^+e^^^+e^^) 


+cos((t+i-m).Q+e^+e^^^-9^^J 


+cos((tn+n,)<oo+e^^^-9^+e^^) 


+cos((t-£+m).Q^^^-e^^^-e^)}/4 

=  0  . 


The  Spectrum  Function 
By  definition. 


f(x)  =  2(1+2   z  p(k)cosxk)  0  <  X  <  TT 

k=l  "  " 


00  2 

2+4    E  cos(a)nk)cos(xk)e"'^'^ 
k=l  ^ 


CO  2 

2+2    z  Q~^°  ^^[cos{{ain+A)k)+cos((a,n-A)k)], 
k=l  "  ^ 


and  since 
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"r^  ^'^rncuv    -  l-pcosx-p"cosnx+p""*'^cos(n-l)x  1-pcosx 
L    p  cosKx  K  ->•  ^  as 

I<=1  l-2pcosx+p  l-2pcosx+p 


we  have 


2  2  2 

f(X)  =  2+2[(l-e"''  /^cos(a)Q+X))/(l-2e'''  ^hos{^Q+\)+e"'  )  + 

2  2  2 

d-e"""  ^^cos(a)Q-X))/(l-2e"''  ^^cosi^^-X)-^'''  )]  .  (4.2.3) 


It  is  wen  known  (see  e.g.,  Chatfield,  1980,  Chapter  4)  that  the 
spectrum  function  of  a  deterministic  sinusoidal  perturbation  which  has 
a  definition  similar  to  (4.2.1)  except  that 


9^.  =        for  any  i  >  1  , 


has  its  maximum  value  at  X  =  uq.  But  this  is  not  true  in  our  model 
because 


9f(x) 
3X 


-a^/2  -a^ 
^^^^  =  e        sin2ug(e     -1)  >  0     for  cuq     0  . 


To  find  the  X  value  which  maximizes  f(x)  is  quite  difficult.  Instead 
of  doing  so,  we  give  two  plots.  Fig.  4.3  and  Fig.  4.4.,  to  illustrate 
some  behaviors  of  f(x). 

Figure  4.3  is  a  plot  of  f(X)  when  a2  =  i.o  and       =  o.4,  from 
which  one  can  see  that  f(X)  does  not  have  the  maximum  value  at 
^  =  (^0-    Figure  4.4  is  a  plot  of  oi*{a^,uiQ)  when  'Jq  =  0-4,  where 
"*(^^,'^0^  ^  v^l'^e  which  maximizes  f(^).    From  this  plot,  one 
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Figure  4.3    Spectrum  density  of  model  (4.2.1)  when 
=  0.4. 


=  1.0  and 
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0.025 


1."         l.S         1.3         2. a 


Figure  4.4    Plot  of  <^*(<J^,0.4)  vs 
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*    2  9 
can  see  how   oj  (a^,o.4)  departs  from  ojq  when  cr^  gets  larger.  Since 

f{X)  does  not  have  its  maximum  value  at  X  =  Uq,  the  classical  spectrum 

analysis  cannot  be  used  here. 

4.3   Minimum  Mean  Square  Error  Predictor 
By  a  well-known  result,  the  mean  square  error  of  the  £-step 
predictor  of  X^+j^  at  time  t,  given  the  knowledge  of  the  model  and  all 
the  X's  up  to  t,  is  minimized  when  it  is  equal  to  the  conditional 
expected  value  of  X^^^^.    Thus,  the  £-step  minimum  mean  square  error 
predictor  at  time  t  is 

=  E{^+JaQ,a>Q,a2.X^.X^_^....} 

=  E{aQC0s((a>Q(m)+e^+A^^^^^)|a.Q.aQ.9^.a2)} 

=  age        ^^C0S(a)Q(t+jl)+9^)  . 

Let  e^(£)  denotes  the  £-step  prediction  error  at  time  t,  then 

2 

n^^^  =  hn  -  ^0^'^''  ^^cos((OQ(t+£)+9^)  , 

and  the  variance  of  e^(ji)  denoted  as  "^{e^U))  is 
V(e^(£))  =  E{X^^^  -  aoe-^'^Hos(a,Q(t+£)+e^)}2 
=  E{(X^^^)2}  -  a2e-^%os2(coQ(t+£)+e^) 
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=  (a^/2)(cos(2a,Q(t+£)+28^)e'^^%l)  -  a^e"^""  cos^(a)Q(t+a)+e^) 

=  a^cos2(2(t+£)o,Q+29^)e"2^^  -  a^e"^*"  /2  +  a^/2  - 

a^e  *  cos(2(t+ii)ajQ+2e^ 
2  2 

=  a^d-e'*''  jCe"^*"  (1/2  -  cos^(  (t+Ooig+e^)  )+l/2) 
2  2 

=  a^(l-e"^^  )(-e"^''  cos(2(t+)i)a,Q+29^)+l)/2  .  (4.3.1) 

In  the  ARMA{p,q)  process  V(e^(£))  goes  to  the  variance  of  X  when  i 
goes  to  infinity.    This  is  also  true  in  the  process  defined  by 
(4.2.1).    But  as  can  be  seen  in  (4.3.1),  V(e^(Jl))  has  maximum  value 
while  cos((t+«.)i«)Q+8^)  =  o  and  minimum  value  while  cos( (t+A)wQ+9^)  = 
±1.    So  V(e^(A))  depends  on  t  while  it  is  independent  of  t  in 
ARMA(p,q)  process.    This  is  due  to  the  fact  that  the  change  in  cosine 
wave  is  small  when  its  angle  is  close  to  0  or  w  and  is  large  when  its 
angle  is  close  to  u/Z. 

Another  difference  about  V(e^(£))  between  these  two  processes  is 
that  in  the  ARMA(p,q)  process  V(e^(ji))  increases  when  i  increases,  but 
this  is  not  necessarily  true  in  the  process  defined  in  (4.2.1).  In 
order  to  illustrate  how  ^(e^{i))  behaves  when  z  increases  while  t,  coq 
and  a  are  fixed,  let  us  assume  that  tt  =  oiqIc  where  k  is  some  positive 
integer,  and  let  A^^^^  denote  cos(2(t+£)a)Q+28|.).  Then 
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I 


V^'^  +  mk  Di  =  1,2,..., 


a2[e-'"'(l-e-'"^^)(l-A^_,(e-'°'(l^-«il))l/2, 


which  is  greater  than  zero,  because 


■1  <  e-'^^l+e-"''^^  -  1  <  1  and 


That  is,  V(e.j.(«.))  is  monotone  increasing  from  period  to  period. 

In  the  case  of  l'  =  l  +  j  j=l,2. . . . ,k-l  : 

(i)    If  A     .<  A.  »  then 

(l-e-^*^)(l-e-^'^A     .)  >  (l-e-^'^)(l-e-^'^V  J  . 
Hence, 

V{e^(^'))  >  V(e^(£))  .  (4.3. 

(ii)    If  A     >  ^         then  (4.3.2)  might  not  hold;  for  example, 

when  A     ,  =  1  and  A     .  =  -1  then  V(e. ( £* ) )-V(e^{ £) ) 
t,^  t,i  t  t 

Ml-e-('*"°')(l-e-'^^'°')-(l-e-^°')(l«-'°') 
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which  is  less  than  zero  if  k  <  Therefore,  function  V(e^(£))  is  no 
longer  monotone  increasing  in  z  within  period. 

Some  plots  of  V(e^(ji))  shown  in  Fig.  4.5  give  a  clearer  picture 
of  what  Y(e^(£))  would  look  like. 

In  Fig.  4.5, 

line  4  is  the  plot  of  V(e^{il))+3  computed  when 
2  2 

aQ  =    2,  a   =  0.1,  Uq  =  ir/20  and  2e^+2ta)Q=  0.2, 

line  3  is  the  plot  of  V(e^(ji))+2  computed  when 
2  2 

aQ  =    2,  a   =  0.1,  ojq  =  ir/10  and  29^+2ta)Q=  0,2, 

line  2  is  the  plot  of  V(e^{ji))+1  computed  when 
2  2 

aQ  =    2,  a   =  0.1,  ojq  =  Tr/20  and  2e^+2t(i)Q=  0 


and 


line  1  is  the  plot  of  V(e^(£))  computed  when 
2  2 

aQ  =    2,  ff   =  0.1,  ojq  =  tt/IO  and  29^+2ta)Q=  0 


4.4   A  Special  Case 
While  it  is  true  that  a  given  model  has  a  unique  autocovariance 
structure,  the  converse  is  not  true.    In  this  section,  we  give  a 
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Figure  4.5    Plot  of  V(et(^))  for  model  (4.2.1). 
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special  case  of  mode!  (4.2.1)  which  has  the  same  autocovariance  as 
that  of  an  AR{1)  model . 

Let  {Y^}  be  a  process  defined  by  (4.2.1)  with      =  0  and  {X^}  be 
an  AR(1)  process  defined  as  follows: 

^t  ~  *^t-l     ^t     '  ^^^^^ 
Ul  <  1  and 

{E^}  are  i.i.d.  normal  random  variables  with  zero  mean 
2 

and  variance  a^. 

2  2 
If  *  =  e"^      and  c|  =  agd-e"''  )/2,  then  {Y^}  and  {X^} 

have  the  same  autocovariance  structure;  that  is. 


2 

Px(k)  =  /  =  (e"^^2)^  =  pY(k)  and 

2  2 

Or  a,, 

var(X^)  =        „    =      =  var(YJ  . 
^      (1-/)       ^  ^ 

The  minimum  mean  square  error  £-step  predictor  of  X^^.^  is 

which  has  the  same  form  as  the  minimum  mean  square  error  £-step 
predictor  of  Y^^^  because 

=  aoe-^^^/2^os(9^)  =  e'^^^^^y^  ^  ^t^^  ^ 
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The  variances  of  the  rstep  prediction  error  for  these  models  are 
var(X^^^-x\(£))  =  V*^'(|  =  a^(l-e"^^^/2 

and 

var(Y^^^-Y^(£))  =  a^(l-e■^'^^  (l-cos(29^)e"^^^/2  . 
In  addition, 

var(Y^^^-Y^(ji))  >  var(X^^^-X^(    )  when 

„  2 

l-cos(2e^)e        >  1 
<=>  cos(28^)  >  0 

<=>  2cos^(9^)  -  1  >  0 

<=>  Y^  >  Bi^AA  or  Y^  <  -ag/v^ 

and 

Pr(|Y^|>aQ/^)  =  Pr(|cos(a^t+e^)|>l/V?")  =  1/2  . 

Hence,  by  the  criterion  of  the  variance  of  the  rstep  prediction 
error,  we  cannot  say  exactly  which  model  is  better. 

4.5    Consistent  Estimators 
It  would  be  desirable  to  find  consistent  estimators  for  the 
parameters  of  model  (4.2.1)  by  utilizing  all  available  data.    But  v» 
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are  unable  to  do  so.  However,  we  find  a  consistent  estimator  for  the 
parameter  Aq  by  using  all  observations  and  a  consistent  estimator  for 
the  parameter  oig  by  using  a  proportion  of  the  observations. 

Lemma  4.5.1     Let  a   =   inf  ((a)„t+e.  )mod(2TT)),  then 

"     l<t<n     "  ^ 

0     a.s.      as  n-H»  . 

Proof:     Since  {(a)Qt+e^)mod(2ir) ,  t>0}  is  a  random  walk  on  a  circle  and 
eq  is  the  starting  point  of  the  process,  without  loss  of 
generality,  we  assume  that  Gq  =  0  and  write 

^n  ~  (^-Ij^E^- )fnod(2Tr)         where  e^.  =  e^-+u)Q  , 
then 

(X-oiQn)^ 

Pr{|S  |<6}  >  /^e  /  U2^l^^  a      >  CJn^^ 

0  ^ 

where       >  0  V  t  . 

Thus,  we  have 

CO 

E  Pr{|S  |<5}  =  CO  . 
n=l 

Then  following  the  proof  given  by  Chow  and  Teicher  (1978, 
example  4.2.1),  we  can  prove  that 


P(|SJ  <  6   i.o.(n))  =1         V  6  >  0 
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This  implies  that 


lim  inf       S    =0  a.s.  . 


Thus     »n     0  a.s.      as  n>»  . 


Lemma  4.5.2     Let  a^  =  max  {|XJ,  IX2I....,  |XJ),  then 


a^  +  aQ   a.s.         as  n  ->•  »  . 


Proof:     If      =  {u:  lim  a    =  a^},  B„  =  {w:  lim  a   =  0}  . 

nx»  n-H»> 

then       c  B^.    So,  by  lemma  4.5.1, 
Pr(B^)  >  Pr(B2)  =  1  . 

Hence,  by  lemma  4.5.2,  a„  is  a  consistent  estimator  for  parameter  ag. 

Before  giving  a  consistent  estimator  for  ojq,  we  state  a  theorem 
which  was  given  by  C.F.  Wu  in  1981. 

Theorem  4.5.3 

Let  9^  be  an  estimator  of  Sq  which  is  based  on  the  minimization 
or  maximization  of  a  function  5^(9).    Then  for  any  6  >  0,  if 


lim  inf^^  inf|Q_9^|,5  ^^n^^^'^n^^O^ ^  '  °  P^o^) 
then  9^  +  9q  a.s.    (in  prob)    as  n+«  . 
Since  {e^  t>l}  appears  in  model  (4.2.1)  only  through  the 


cosine 
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function  which  is  a  periodical  function  of  period  Ztt,  without  loss  of 
generality,  we  condense  all  the  probability  of  in  the  real  line  to 
(-tt.tt);  that  is,  the  density  function  ft(x)  of  the  condensed  variable 


*  . 

^t 


ft(x)  =    E    g^{x+27iK)  -TT  <  X  <  TT  ,  t=l,2,..., 

K=-oo 


where  g^(x)  is  the  density  function  of  e^. 

Then  the  process  {X^}  defined  in  (4.2.1)  can  be  redefined  by 
substituting  {e^,  t>l}  by  {e^,  t>l}.  Since 


X  X 
(9  +0)  t)mod(2Tr)  =  cos"-^      or  27r-cos"-^  —  ,  and 
^  *0  *0 


X  X 
(Qt-i+"n(^-l))'"od(2Tr)  =  cos"-^  ^^or  2Tr-cos"-^  . 


* 


^+a)Q  may  be  equal  to  one  of  the  following  values: 
cos"-^(X^/aQ)  -  cos""^(X^_^/aQ)  or 
2ir  -  cos"-^(X^/aQ)  -  cos"-^(X^_^/aQ)  or 
2Tr  -  cos"-^(X^/aQ)  -  (27r  -  cos"-^{X^_^/aQ)  or 
cos'^(X^/aQ)  -  (2ir  -  cos"-^(X^_^/aQ))  . 
Thus,  we  can  write 
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+l3^(cos  X^i^/aQ-cos~^X^/aQ)+l4^(cos"^X^/aQ+cos"^X^_^/aQ-2Tr) 

(4.5.1) 

where 

{I^-^.  i=l,2,3,4}  are  zero-one  random  variables  whose  values 
depend  on  and  i^(t-l)+e^_^,  and  only  one  of  I^.^'s  is 

equal  to  1  at  each  time  t. 

To  find  a  consistent  estimator  of  uq  through  (4.5.1)  without 
knowing  the  value  of  I^-^'s  is  very  difficult.    Instead  of  doing  this, 
we  would  like  to  find  a  consistent  estimator  for  oiq  conditional  on  the 
knowledge  of  {l^-^,  t>l }. 

But  it  is  impossible  to  identify  I^.^'s  for  all  t.  especially  when 
either  X^  or  X^.^  is  near  by  ag  or  -ag.    Thus,  we  give  an  estimator  of 
by  using  only  those  X^'s  of  which  1,-^  can  be  identified.    And  it  is 
possible  to  identify  I^.^  for  some  t  when  both      and      are  relatively 
smaller  than  tt;  for  example,  if      =      =  O.Itt  then 

Pr(|  e*+a^|>0.5TT)  ~  Pr(|e^+a^|>0.5Tr)  =  l-$(4)  +  $(-6)  =  0.00003  . 
Thus 

Pr(l2^=l  or  I^^=l|  lcos"^X^/aQ-cos"^X^_j/aQ|<0.5TT)  =  0.00003 

and  either        =  1  or  I^^  =  1  can  be  determined  by  the  direction  of 
the  movement  of  data  around  t. 
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Let  T  =  {t:    l<t<n  where  I^.^  are  known},  and  let      be  the  number 

of  points  in  set  T.    And  for  fixed  n,  n^  decreases 
2 

when  a   or  ui^  increase.    Now  an  estimator  of      based  only  on  those 
X/s  where  ieT  is  given  as  follows: 


T  1  eT 

"  . ^^^i ^^i *^i-l'^o^^"T'  ^^^^^ 

f^.(X..X._^.aQ)  =  cos'^X./aQ-cos"^X..^/aQ  or  2TT-cos'^X./aQ-cos"^X._^/a( 
or 

cos'^X._^/aQ-cos~^X./aQ     or    cos'^X./aQ+cos"^X..^/aQ-2Tr  . 

Theorem  4.5.4 

<^P^(aQ)  is  a  consistent  estimator  of  to^. 


Proof:     It  can  be  seen  that  o)^  (a^)  is  an  estimator  which  minimizes 


And 


2  2 


S„('^)-S^(caQ)  =  -  i:  s    +  E  (s    +0,  -0.) 
1  eT       1  eT 


2  *2 
i  eT    °  0      i  eT  ^ 
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and  from  the  definitions  of      and  e*»  we  have 


E  e^.  =  0,  warU.)  <  a    <  »  . 


* 


c  /  \_o  /     \  2(0)^-0))  E  e. 

Thus,  we  have   „  "    "    =  (a)„-a,)^  li!—  >  0  a.s. 

nj  u  nj 

for  any  m^ui  ^  Then,  by  Theorem  4.5.3,  we  can  conclude  that 
u)p^(aQ)  is  a  consistent  estimator  of  ojq. 


We  now  prove  that  oip^Cap)  is  a  consistent  estimator  of  ojq  when 
all  X^. ,  ieT,  are  bounded  away  from  a^.    The  Taylor  expansion  of 


A  A 


^0  ^'^ 


a=a*  '  ^'^^'"^  ^*^'^^V^0^ 


irX^^)  =.-|^(l/n^).z  f,.(X.,X._^.a)  . 


Since  ^  cos"^  ^  =  -X^/(a v7^).  it  can  be  seen  that  ^ 
a)  has  the  form 


8it^/(Wa^-X^)+32tXt-i/(aVa^-X^_l) 

where 

^It  ^      °^  ~^        ^2t  ^  ■'■^  °^  "1  depending  on  t. 
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and  by  lemma  4.5.2,  a    -»■  a„     a.s.  , 

n  0 

then  we  have  a*  >  a^     a.s.  ,     because  a*e[aQ,aQ]  . 

These  results,  together  with  the  assumption  that  X^'s  are  bounded  away 
from  aQ,  imply 


a  ^a    -X^     <  »       and       a  Wa    -X^_^     <  « 


Thus 


*  is  less  than  infinity. 


a=a 


Hence 


a=a*  ^0 


And  by  theorem  4.5.4,  '^^^(3^^)  >       a.s.  ;  therefore,  we  can 


conclude  that  oi    (a  ) 


A  A 


'n^^^n^  *  "o  '^^^^       '^n^^^n^      ^  consistent 


estimator  of  ui^. 


CHAPTER  V 

FUTURE  RESEARCH:    GENERAL  MODEL  AND  SUNSPOT  DATA 

5.1    General  Model 
The  model  proposed  in  section  4.2  includes  only  one  frequency  and 
is  a  rather  simple  model.    To  explain  some  phenomena  in  the  real 
world,  we  may  need  a  more  complicated  model  that  contains  more  than 
one  frequency  component.    A  model  more  general  than  (4.2.1)  can  be 
defined  as  follows: 

m 

'^t  "  "O'^.f  ^j^^^j^'^O^'^t^)  (5.1.1) 

where 

Oq  and  a^s  >  0  are  real  constants,  and  oi^,  9^  are  the  same 
as  those  defined  in  (4.2.1). 

Then  process  {X^}  defined  in  (5.1.1)  is  again  a  stationary  process. 
The  proof  is  omitted  since  it  is  identical  to  the  proof  of  the 
stationarity  of  model  (4.2.1).    Similar  properties  are  derived  as 
fol lows: 

The  means  of  X^  is 
EX^  =  . 
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The  autocovariances  are 

r(k)  =  EX^X^,^-(EX^)2 


m     5  .2.  2,„ 

=    z  (aV2)cos(jka),Je"J  , 
j=l    J  " 


m  5 

var(X  )  =  r(0)  =    Z  a^/2  . 
^  j=l  J 


The  autocorrelation  function  is 

p(lc)  =    s  a^cos(jkco„)e"J      '^1  z  . 
j=l  ^  "  j=l  J 


The  spectrum  function  is 

m  m  5 

f{X)  =  2+2  E  f  .(X)/  z  a.  0  <  X  <  TT  , 

j=l  J       j=l  J  ■  ■ 

where 

fj(X)  =  a2{(lVJ^^^/2^os(ja^+X))/(l-2e"J^^/2^os(ja:jj+X)+e"J^'^ 

•2  2  2  2  2  2 

+(l-e"J  ^  ^^cos(ja^-x))/(l-2e""^     ^^cosCjo^-xj+e'^'  ^)}  . 

But  the  method  of  estimating      and  aj's  in  section  4.5  cannot  be 

easily  extended  to  model  (5.1.1),  because  it  includes  more  than  one 

a^.  and  to^  cannot  be  extracted  easily  by  inverting  the  function 
m 

z  a  cos(o(a)  t+e J).  Parameter  estimation  for  model  (5.1.1)  has  not 
been  done,  but  it  warrants  further  research. 
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5.2   Modeling  Sunspot  Data 
Wolfer's  sunspot  data  have  attracted  many  authors'  interest. 
Woodward  and  Gray  (1978)  summarize  models  proposed  by  other  authors 
and  gave  three  new  models;  they  are 

(1-1.64B+B^)(1-.29b'^-.21B^-.19B^)X^  =  (l-.34B)a^,  (5.2.1) 

( 1-1 . 64B+. 94B^) ( 1-. 18B-.2B^-. IB^-. 26B^-. 15B^-. 16B^)X . 

=  (l-.58B)a^  (5.2.2) 

and 

(1-1.66B+0.96B^)X^ 

=  {l-.39B-0.09B^-.09B^+.25B'^+.llB^+.18B^)a.  .  (5.2.3) 

Woodward  and  Gray  compare  these  models  to  the  well-known  ARMA(2,0) 
model  suggested  by  the  Box  and  Jenkins  method  and  claim  that  their 
models  are  better. 

As  mentioned  in  section  3.1,  Wolfer's  sunspot  data  present  some 
kind  of  periodicity  with  unequal  period  from  time  to  time;  thus  a 
model  like  (4.2.1)  may  be  a  good  approach  to  it.    In  reviewing  Fig. 
2.2,  we  also  notice  that  the  local  maximum  of  Wolfer's  sunspot  data 
fluctuate  as  a  cosine  function.    So  we  substitute  ag  of  model  (4.2.1) 
with  another  jittery  cosine  wave  to  form  a  new  model,  and  use  the  new 
model  to  fit  the  sunspot  data.    The  model  is  defined  as  follows: 
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=  a+e(l  +  7C0S(a,^t+e^))(C0S{a)Qt+e^)+l)  (5.2.4) 

where 

ot,  3  and  y  ai*e  some  positive  real  constants. 


t 

"  S  is  defined  as  in  (4.2.1)  and 
^     i=0  ^ 


^i.  =     E  e--  with 
^  i=0^ 

^  ~  U(0,2Tr),  and 
* 

i>l}  are  i.i.d.  normal  random  variables  with  zero  mean  and 

.       .         2  * 
and  variance  ay  also  all  e^'s  and  e.'s  are  mutually  independent. 

By  following  the  steps  of  proof  in  theorem  4.2.1.,  we  can  prove  that 
the  process  {Y^}  defined  in  (5.2.4)  is  a  stationary  process  too.  The 
autocovariance  of  {Y^}  is  derived  as  follows: 

E(Y^)  =  o+s  t=l,2,... 

r(k)  =  EY^Y^^^-(EY,)2 

2 

~k(y  /2  2 
=  6^{(Y^e     ^    cos(^k))/2+(e"'^<^ /2cos(,^k))/2+ 

Lye  cos((a,^+a^)k]/8+CYS  cos(  ( <^-<^)k]/8 } 
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=  6  {(y  e         cos(a)^k)+e  '^'^  '  cosioigk) )/2+ 
2  -k(a^+a^)/2 

We  cos(a)^k)cos(a)Qk))/4}  k=0,l,2,...  ; 

thus 

var{Y^)  =  r(0)  =  (1/2  +3Y^/4)e^ 
and  the  autocorrelation  function  is 

p(k)  =  {e""^^ /2^.Qs(a)Qk)(l+^e  cos(coj^k))/2+ 

2  ''^<^i/2  P 
(ye  cos(ajjk))/2}/(l/2  +3y /4)    .  (5.2.5) 

Although  it  is  mentioned  in  section  4.4.  that  two  different 
stationary  processes  may  yield  the  same  autocorrelation  structure, 
comparing  the  sample  autocorrelations  from  observed  data  with  the 
theoretical  autocorrelations  of  the  model  is  one  of  the  popular 
methods  to  check  the  adequacy  of  any  proposed  models.    Thus,  we  fit 
the  sample  autocorrelations  of  sunspot  data  to  equation  (5.2.5)  and 
obtain  the  estimated  theoretical  autocorrelations  of  model  (5.2.4)  for 
sunspot  data.    Since  Woodward  and  Gray  just  give  the  plot  of  auto- 
correlations of  their  models.  Procedure  MATRIX  of  SAS  package  is  used 
to  produce  these  autocorrelations.    The  sample  autocorrelations  of 
sunspot  data  and  the  theoretical  autocorrelations  of  model  (5.2.1), 
model  (5.2.2)  and  model  (5.2.3)  together  with  the  estimated  theore- 
tical autocorrelations  of  model  (5.2.4)  are  given  in  Table  5.1.  The 
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Table  5.1 
Autocorrelations  of  sunspot  data. 


P^{K) 


P2(K) 


P3(K) 


P  (K) 


1 
2 
3 
4 
5 
6 
7 
8 
9 

10 

11 

12 

13 

14 

15 

16 

17 

18 

19 

20 

21 

22 

23 

24 

25 

26 

27 

28 

29 

30 

31 

32 

33 

34 

35 

36 

37 

38 

39 

40 

41 

42 

43 


0.82000 
0.34480 
-0.25453 
-0.76223 
-0.99552 
-0.87043 
-0.43198 
0.16198 
0.69763 
0.98213 
0.91307 
0.51530 
-0.06797 
-0.62678 
-0.95994 
-0.94753 
-0.59400 
-0.02664 
0.55032 
0.92916 
0.97350 
0.66738 
0.12101 
-0.46893 
-0.89005 
-0.99076 
-0.73479 
-0.21430 
0.38334 
0.84298 
0.99914 
0.79562 
0.30567 
-0.29432 
-0.78836 
-0.99858 
-0.84932 
-0.39430 
0.20267 
0.72667 
0.98908 
0.89541 
0.47940 


0.82680 
0.45660 
0.03480 
-0.30690 
-0.50327 
-0.51216 
-0.34436 
-0.04550 
0.28499 
0.53938 
0.63784 
0.56097 
0.34368 
0.06021 
-0.20371 
-0.37271 
-0.40315 
-0.29406 
-0.08710 
0.14857 
0.33903 
0.42897 
0.39690 
0.25925 
0.06289 
-0.13056 
-0.26390 
-0.30125 
-0.23764 
-0.09870 
0.06883 
0.21251 
0.29026 
0.28234 
0.19590 
0.06123 
-0.07872 
-0.18195 
-0.21998 
-0.18556 
-0.09365 
0.02451 
0.13167 


0.79999 
0.38545 
-0.05773 
-0.36961 
-0.50195 
-0.44856 
-0.26273 
-0.00552 
0.24306 
0.40878 
0.44524 
0.34666 
0.14803 
-0.08706 
-0.28663 
-0.39223 
-0.37594 
-0.24752 
-0.04997 
0.15466 
0.30471 
0.35735 
0.30067 
0.15606 
-0.02958 
-0.19892 
-0.30181 
-0.31005 
-0.22493 
-0.07575 
0.09019 
0.22244 
0.28267 
0.25568 
0.15307 
0.00865 
-0.13260 
-0.22841 
-0.25187 
-0.19883 
-0.08826 
0.04436 
0.15837 


0.85974 
0.55218 
0.17673 
-0.15482 
-0.36234 
-0.37233 
-0.22707 
0.02355 
0.29349 
0.49799 
0.57826 
0.51649 
0.33760 
0.09839 
-0.13153 
-0.29099 
-0.34408 
-0.28837 
-0.15277 
0.01342 
0.15625 
0.23304 
0.22363 
0.13443 
-0.00546 
-0.15424 
-0.27028 
-0.32398 
-0.30540 
-0.22578 
-0.11293 
-0.00221 
0.07403 
0.09562 
0.05956 
-0.02004 
-0.11706 
-0.20152 
-0.24854 
-0.24526 
-0.19402 
-0.11090 
-0.02041 


0.81 
0.43 
0.03 
-0.26 
-0.40 
-0.36 
-0.17 
0.10 
0.34 
0.49 
0.50 
0.37 
0.17 
-0.04 
-0.18 
-0.25 
-0.24 
-0.19 
-0.10 
0.01 
0.12 
0.20 
0.20 
0.12 
-0.01 
-0.16 
-0.26 
-0.30 
-0.30 
-0.26 
-0.16 
-0.02 
0.09 
0.12 
0.07 
-0.04 
-0.15 
-0.25 
-0.30 
-0.30 
-0.24 
-0.15 
-0.03 
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Table  5.1-continued. 


Pl(K) 


P3(K) 


P^IK) 


P  (K) 


44 
45 
46 
47 
48 
49 
50 
51 
52 
53 
54 
55 


-0.10920 
-0.65848 
-0.97071 
-0.93349 
-0.56021 
0.01475 
0.58440 
0.94366 
0.96321 
0.63600 
0.07983 
-0.50508 


0.19612 
0.20091 
0.14798 
0.05651 
-0.04392 
-0.12279 
-0.15787 
-0.14141 
-0.08157 
0.00099 
0.08002 
0.13191 


0.22031 
0.21368 
0.14321 
0.03259 
-0.08338 
-0.16969 
-0.20165 
-0.17183 
-0.09166 
0.01281 
0.10925 
0.16906 


0.05209 
0.08783 
0.07976 
0.03406 
-0.03230 
-0.09711 
-0.13967 
-0.14688 
-0.11667 
-0.05817 
0.01159 
0.07303 


0.07 
0.13 
0.13 
0.11 
0.03 
-0.07 
-0.17 
-0.24 
-0.25 
-0.21 
■0.11 
0.03 


Theoretical  autX)correlations  of  model 

(5.2.1) 

Theoretical  autocorrelations  of  model 

(5.2.2) 

P^iK): 

Theoretical  autocorrelations  of  model 

(5.2.3) 

P^K): 

Estimated  theoretical  autocorrelations 

of  model  (5.2.4) 

P  (K): 

Sample  autocorrelations  from  data 
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corresponding  plots  are  shown  respectively  in  Fig.  5.1,  Fig.  5.2  and 
Fig.  5.3.    These  plots  show  that  model  (5.2.4)  can  fit  sunspot  data 
much  better  than  model  (5.2.1).  model  (5.2.2)  and  model  (5.2.3)--at 
least  in  the  sense  of  fitting  autocorrelations.    It  would  be  necessary 
to  develop  a  methodology  to  estimate  the  parameters  of  model  (5.2.4) 
and  predict  the  future  values  from  this  model.    But  we  are  unable  to 
do  it  at  this  moment.    This  could  be  an  interesting  topic  for  future 
research. 
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*  denotes  sample  autocorrelations  of  data 

□  denotes  theoretical  autocorrelations  of  model  (5.2.1) 

+  denotes  estimated  theoretical  autocorrelations  of  model  (5.2.4) 

Figure  5.1    Autocorrelations  of  sunspot  data  of  model  (5.2.1)  and 
others. 


*  denotes  sample  autocorrelations  of  data 

□  denotes  theoretical  autocorrelations  of  model  (5  2  2) 

+  denotes  estimated  theoretical  autocorrelations  of  model  (5  2  4) 


Figure  5.2    Autocorrelations  of  sunspot  data  of  model  (5.2.2)  and 
others. 
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*  denotes  sample  autocorrelations  of  data 

D  denotes  theoretical  autocorrelations  of  model  (5  2  3) 

+  denotes  estimated  theoretical  autocorrelations  of  model  (5.2.4) 

Figure  5.3    Autocorrelations  of  sunspot  data  of  model  (5.2  3)  and 
others. 
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